-
-
Notifications
You must be signed in to change notification settings - Fork 33.5k
GH-140638: Add a GC "candidates" stat #141814
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pablogsal
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! 🚀
I assumed that visited is not initialized because IIRC the interpreter struct is calloc-ed. I'm on my phone so is not easy to double check but just a thing to check before landing
Python/gc.c
Outdated
| add_stats(GCState *gcstate, int gen, struct gc_collection_stats *stats) | ||
| { | ||
| gcstate->generation_stats[gen].duration += stats->duration; | ||
| gcstate->generation_stats[gen].visited += stats->visited; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this matters but what would happen if this overflows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we generally worried about overflowing Py_ssize_t?
I can switch this to size_t if we are so at least the overflow isn't UB. A Python int seems sort of heavy for this, but we could go that route if needed too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we generally worried about overflowing
Py_ssize_t?
Nah, I think this has the same problem as the other fields just wanted to raise it here in case we both believe is worth fixing. I don't think is important, but it is possible. WDYT? I am fine to ignore just to be clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll just check for overflow here and cap the value at PY_SSIZE_T_MAX to be paranoid about UB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I'm not worred about this (the fix looks like more trouble than it's worth). If we feel like this is a problem we can just make these fields unsigned in another PR.
Doc/library/gc.rst
Outdated
| "uncollectable": When *phase* is "stop", the number of objects | ||
| that could not be collected and were put in :data:`garbage`. | ||
|
|
||
| "visited": When *phase* is "stop", the number of unique objects visited |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be missing something but if I understand this correctly this counts objects in the collection set, not all objects visited during traversal (which would include objects referenced by containers). Consider maybe renaming to "objects_in_collection" or "initial_objects" or updating implementation to count all visited objects (including referenced ones during subtract_refs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, @nascheme pointed this out too on the issue. I think naming is hard... this can really be any of:
- Total visits performed (edges in the graph).
- Total unique objects visited (nodes in the graph, optionally minus immortals, marked-alive objects, untracked objects, etc.).
- Total objects in the current generation (nodes in the generation subgraph, optionally minus immortals, marked-alive objects, untracked objects, etc.).
I'm really trying not to overthink it... I think the probably most intuitive answer is something along the lines of "How many unique objects did the GC consider before eventually freeing <collected> number of them"? Like I said above, collected / visited should be a meaningful efficiency metric to help see at a glance if you're running at the right times (in addition to memory metrics, of course).
But definitely open to suggestions. I naturally lean towards a simpler API name and more nuance in the docs for it, rather than trying to convey all of the nuance in one long name alone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put another way, it seems like objects that can't count towards collected also shouldn't count towards visited. So I'm pretty sure that would exclude anything reachable from, but not part of, the current generation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree but even keeping the "naming is hard" optics, I would say that anything other than visited is good because at least we can agree that surely this is not "number of visited objects" no? Maybe candidates?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, let's do "candidates"!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if "objects_marked" is not a significant fraction of "candidates", there could be something wrong.
Clarification about this: that makes more sense when looking at the free-threaded GC. For the default build GC, the mark-alive operation is done as a separate step on the incremental collection. In that step, candidates == objects_marked. In other incremental steps, objects_marked == 0. If we change gc_collect_full() to run a mark-alive operation first, comparing those numbers will be useful for the default build GC too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, after thinking about it more I agree marked-alive should be included.
For the default build GC, I think you can just add the objects_marked count to the candidates total.
I think this could overcount, if we mark objects that aren't part of the current generation? I'll look into it a bit more.
I don't think this is useful and we should change it to not invoke the callback in that case.
Eh, I'm not sure I like this. I think it's reasonable to expect callbacks to be called exactly twice, and can easily imagine scenarios where a "start" call with no corresponding "stop" call could really screw things up (for example, a callback that sets gc.set_debug(gc.DEBUG_SAVEALL) on some tiny fraction of collections to log common sources of cycles).
Finally, perhaps we should include the "objects_marked" count in the info dict as well.
Sure, I can open another PR for this after (unless you wanted to take it). While we're at it, "untracked" seems useful, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, looking at the default GC implementation, I agree it's not super intuitive how to attribute marked-alive objects to individual increments.
Probably for the mark phase, we just populate candidates in the "stop" info, and ignore the marked objects in other collections. That way the aggregate stats still make sense... maybe this is what you were suggesting and I just misunderstood?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, this is what I did (for the marking phase, dump the count of objects marked into the candidates stat).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's reasonable to expect callbacks to be called exactly twice
Yeah, I think that's more important than avoiding "useless" callbacks where work_to_do < 0. I was only looking a the phase == 'stop' calls.
Yeah, per-collection stats have an initializer here (here in free-threading), so unnamed members are zeroed. The process-wide stats live on the |
|
My choice of wording for the docs:
|
This exposes a counter for the number of unique objects seen in the current generation. Previously, something like
sum(len(gc.get_objects(g)) for g in range(gen + 1))was needed to get the a statistic like this (in this PR I've made the choice to collect this statistic inupdate_refs, and omit immortal but include marked members of the current generation).It's useful for things like
info["collected"] / info["candidates"], to get the efficiency of a collection.durationandvisitedto thegc.callbacksandgc.get_stats()dicts #140638📚 Documentation preview 📚: https://cpython-previews--141814.org.readthedocs.build/